Thrombosis-related circulating miR-16-5p is associated with disease severity in patients hospitalised for COVID-19

ABSTRACT SARS-CoV-2 tropism for the ACE2 receptor, along with the multifaceted inflammatory reaction, is likely to drive the generalized hypercoagulable and thrombotic state seen in patients with COVID-19. Using the original bioinformatic workflow and network medicine approaches we reanalysed four coronavirus-related expression datasets and performed co-expression analysis focused on thrombosis and ACE2 related genes. We identified microRNAs (miRNAs) which play role in ACE2-related thrombosis in coronavirus infection and further, we validated the expressions of precisely selected miRNAs-related to thrombosis (miR-16-5p, miR-27a-3p, let-7b-5p and miR-155-5p) in 79 hospitalized COVID-19 patients and 32 healthy volunteers by qRT-PCR. Consequently, we aimed to unravel whether bioinformatic prioritization could guide selection of miRNAs with a potential of diagnostic and prognostic biomarkers associated with disease severity in patients hospitalized for COVID-19. In bioinformatic analysis, we identified EGFR, HSP90AA1, APP, TP53, PTEN, UBC, FN1, ELAVL1 and CALM1 as regulatory genes which could play a pivotal role in COVID-19 related thrombosis. We also found miR-16-5p, miR-27a-3p, let-7b-5p and miR-155-5p as regulators in the coagulation and thrombosis process. In silico predictions were further confirmed in patients hospitalized for COVID-19. The expression levels of miR-16-5p and let-7b in COVID-19 patients were lower at baseline, 7-days and 21-day after admission compared to the healthy controls (p < 0.0001 for all time points for both miRNAs). The expression levels of miR-27a-3p and miR-155-5p in COVID-19 patients were higher at day 21 compared to the healthy controls (p = 0.007 and p < 0.001, respectively). A low baseline miR-16-5p expression presents predictive utility in assessment of the hospital length of stay or death in follow-up as a composite endpoint (AUC:0.810, 95% CI, 0.71–0.91, p < 0.0001) and low baseline expression of miR-16-5p and diabetes mellitus are independent predictors of increased length of stay or death according to a multivariate analysis (OR: 9.417; 95% CI, 2.647–33.506; p = 0.0005 and OR: 6.257; 95% CI, 1.049–37.316; p = 0.044, respectively). This study enabled us to better characterize changes in gene expression and signalling pathways related to hypercoagulable and thrombotic conditions in COVID-19. In this study we identified and validated miRNAs which could serve as novel, thrombosis-related predictive biomarkers of the COVID-19 complications, and can be used for early stratification of patients and prediction of severity of infection development in an individual.Abbreviations: ACE2, angiotensin-converting enzyme 2AF, atrial fibrillationAPP, Amyloid Beta Precursor ProteinaPTT, activated partial thromboplastin timeAUC, Area under the curveAβ, amyloid betaBMI, body mass indexCAD, coronary artery diseaseCALM1, Calmodulin 1 geneCaM, calmodulinCCND1, Cyclin D1CI, confidence intervalCOPD, chronic obstructive pulmonary diseaseCOVID-19, Coronavirus disease 2019CRP, C-reactive proteinCV, CardiovascularCVDs, cardiovascular diseasesDE, differentially expressedDM, diabetes mellitusEGFR, Epithelial growth factor receptorELAVL1, ELAV Like RNA Binding Protein 1FLNA, Filamin AFN1, Fibronectin 1GEO, Gene Expression OmnibushiPSC-CMs, Human induced pluripotent stem cell-derived cardiomyocytesHSP90AA1, Heat Shock Protein 90 Alpha Family Class A Member 1Hsp90α, heat shock protein 90αICU, intensive care unitIL, interleukinIQR, interquartile rangelncRNAs, long non-coding RNAsMI, myocardial infarctionMiRNA, MiR, microRNAmRNA, messenger RNAncRNA, non-coding RNANERI, network-medicine based integrative approachNF-kB, nuclear factor kappa-light-chain-enhancer of activated B cellsNPV, negative predictive valueNXF, nuclear export factorPBMCs, Peripheral blood mononuclear cellsPCT, procalcitoninPPI, Protein-protein interactionsPPV, positive predictive valuePTEN, phosphatase and tensin homologqPCR, quantitative polymerase chain reactionROC, receiver operating characteristicSARS-CoV-2, severe acute respiratory syndrome coronavirus 2SD, standard deviationTLR4, Toll-like receptor 4TM, thrombomodulinTP53, Tumour protein P53UBC, Ubiquitin CWBC, white blood cells


Introduction
Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and is associated with increased risk of mortality and adverse cardiovascular (CV) events especially among patients with underlying cardiovascular diseases (CVD) [1,2].
Key mechanisms which may drive the pathophysiology of multi-organ injury secondary to infection with SARS-CoV-2 include direct viral toxicity, thrombosis and inflammation, leading to thromboinflammation which is simultaneous and co-localized activation of thrombotic and inflammatory response [3,4]. Thromboinflammation is associated with dysregulation of normal anti-thrombotic and anti-inflammatory functions of endothelial cells and negatively influences haemostasis, which is especially dangerous in microvasculature, where microthrombi deposition can occur [5,6]. In line, patients with severe COVID-19 present with microvascular thrombosis and haemorrhage. This was associated with lung pathology presenting as extensive alveolar and interstitial inflammation which could be suggestive for the assessment of the case fatality [7]. Autopsy studies have revealed the presence of thromboinflammation within the pulmonary capillary vasculature. Moreover, it should be noted that patients with COVID-19 are more prone to develop pulmonary embolism, deep vein thrombosis, arterial thrombosis and intracatheter thrombosis, which is also negative for survival prognosis [8,9].
Early in the pandemic, hospitalized patients with COVID-19 were often observed to have changes in the levels of coagulant biomarkers, including fibrinogen, D-dimer and activated partial thromboplastin time (aPTT), and routine measurement of these biomarkers was recommended [10][11][12]. However, a number of other markers of coagulation have emerged that have helped to refine our understanding of the thrombotic signature of COVID-19 [13]. Currently there is a need for identification of biomarkers which would allow early identification of patients in risk of thrombosis. MicroRNAs (miRNAs, miRs) are a class of small, endogenous, non-coding RNA (ncRNA) molecules that have such a potential. They play an important role in many biological processes and regulate the expression of approximately 60% of the mammalian protein coding genes [14]. MiRNAs act primarily by binding to complementary regions of messenger RNA (mRNA), leading to repression of its translation or even induction of its degradation [15]. Various miRNAs, as well as their target genes are implicated in the complex pathophysiology of coagulation and CVDs [16]. Thus, they may be useful for diagnosis, prognosis and as a potential therapeutic strategy in multiple pathologies [1,15,[17][18][19][20][21][22][23][24][25]. There is also growing evidence from epidemiological studies and animal models suggesting that the expression level of miRNAs not only regulates coagulation and haemostatic factors but is dysregulated in venous thromboembolism (VTE) [26]. One of the case-control studies identified 9 differentially expressed (DE) miRNAs (hsa-miR-4451, hsa-miR-942-3p, hsa-miR-8063, hsa-miR-3132, hsa-miR-3118, hsa-miR-105-5p, hsa-miR-891a-5p, hsa-miR-200a-5p, and hsa-miR-6832-3p) in the blood of colorectal cancer patients who developed VTE compared to controls in this pilot study [27]. Other diagnostic biomarkers for PE include miR-134 [28], miR-1233 [29] and miR-28-3p [30]; while diagnostic biomarkers for DVT include miR-582, miR-195, miR-532 [31], miR-424-5p, miR-136-5p [32], and miR-320a, miR-320b [33]. Taken together, due to their stability and regulation of a high number of disease-related genes miRNAs may be relevant not only as biomarkers for different kinds of thrombosis but could also be involved in the pathogenesis of thrombosis and, therefore, be potential therapeutic targets.
Implementation of bioinformatics tools can reveal interactions between genes and their non-coding regulators, i.e. miRNAs, which may help to understand the pathomechanisms of disease development [1,34]. A biggest advantage of network medicine and systems biology approach is the ability to identify novel, key regulators of the pathological processes including investigation of the COVID-19 related changes in coagulation processes. Barabási et al. summarized a series of hypotheses and principles (Network Medicine Hypotheses) which link topological properties of Protein-Protein Interaction (PPI) networks to biological processes [35]. Those hypotheses are often used to prioritize candidate genes associated with the analysed disease. We highlight three hypotheses (i) disease module hypothesis: gene products associated with the same disease phenotype tend to form a cluster in the PPI network; (ii) network parsimony: disease pathways often coincide with shortest paths between known disease genes; (iii) local hypothesis: gene products associated with similar diseases are likely to strongly interact with each other [35]. In this way, we used an innovative NERI algorithm [36] which integrates the interactome data with co-expression networks, enabling us to focus on the signalling cascade between ACE2 and genes associated with thrombosis and coagulation-related processes. By assuming some Network Medicine hypothesis (such as: network parsimony and disease modularity), the method explored the neighbourhood of a gene set (seeds) by choosing the smallest paths possessing more coexpressed genes with the seeds. As output, the method returns two subnetworks (modules): control and disease subnetworks, for which the method ranks genes and interactions (edges) according to their topological importances in both subnetworks (score X) or to their topological alterations between the subnetworks (score Δ', also called S internally by the algorithm implementation). This approach enabled us to discover a cluster of party hub genes [37,38], also called 'bottleneck regulators', with corroborating signals across transcript expression and protein-protein interaction data, causing to pathological alterations in coagulation processes even when such regulators show little or no changes in expression between control and disease conditions.
MiRNAs have been studied as diagnostic and prognostic biomarkers for many diseases including myocardial infarction, liver failure, sepsis, or ischaemic stroke [39,40]. Whilst protein-based biomarkers for COVID-19 have been highly studied [41,42], however, limited number of studies aimed to analyse preselected circulating miRNAs in COVID-19 patients thus far [43][44][45]. Lately, the strong association of cardiometabolic miRNAs with COVID-19 severity and mortality has been shown and combinations of miRNAs improved classification performance of established markers for severity and mortality of COVID-19 [45].
In line with previous publications, in the current study, we have precisely selected analysed miRNAs based on bioinformatic analysis including the top thrombosis and coagulation-related miRNA (miR-16, let-7b, miR-27a, miR-155) and aimed to unravel whether bioinformatic prioritization could guide selection of miRNAs with a potential as diagnostic and prognostic biomarkers associated with disease severity in patients hospitalized for COVID-19.

Gene lists selection
We downloaded the following lists of genes: 98 genes associated with thrombosis term from Disgenet database (https://www.disge net.org/browser/0/1/0/C0040053/); 62 genes associated with thrombosis term from Malacards database; 224 genes related to coagulation and 341 genes related to platelet activity extracted by BiomartR R package from Gene Ontology database. The 68 genes involved in the ACE2 network were obtained from a previous publication [1]. The gene symbols were unified using the NCBI annotation file.

Tissue specific expression
In order to identify genes with expression in the CV system, we mined Tissues 2.0 database [46]. For further PPI interactions analyses leading to seed gene selection we selected genes with expression confidence cut-off at least 2 in Blood, Heart, Pericyte or CV system. For pre-selection of miRNAs we also used the ones expressed with high confidence in blood and above mentioned tissues.

Expression datasets analysis
We downloaded three expression datasets from the Gene Expression Omnibus (GEO) database: 1) Peripheral blood mononuclear cells (PBMCs) isolated from SARS patients and controls (GSE1739) [49]; 2) Human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) infected with SARS-CoV2 vs control cultures (GSE150392) [50]; 3) Autopsies of SARS-CoV-2 infected tissues from which we selected datasets related to Lungs, Hearts and control samples (GSE150316) [51]. We performed differential expression analysis for each dataset using a linear regression model (GSE1739, GSE150316) or Mann Whitney test (GSE150392). As a differentially expressed (DE) gene we selected those which had FDR adjusted p-value <0.05.

PPI network analysis with NERI method
To identify top genes which could play a role in COVID-19related thrombosis but are not necessarily DE we performed co-expression analysis using network-medicine based integrative approach (NERI) algorithm [36] and three sets of seed genes. The selection of the seeds is described in the results section (Selection of the ACE-2 related coagulation seed genes). The NERI method returns a subnetwork representing the neighbourhood of a gene set by selecting the shortest paths between all pairs of seeds with the most coexpressed genes along the paths. This method is independently applied for two conditions (control and disease), thus obtaining the control and disease subnetworks (modules). The method outputs two scores X and Δ' (S) regarding the topological analysis of the two resulting subnetworks (control and disease). The first one (X) prioritizes genes with party hub features [37,38] in both modules, possessing high topological centrality and, at the same time, high co-expression relative to the seed genes. The second one (Δ') prioritizes the most topologically altered genes and edges between the two conditions [36]. Each analysis provided 100 best nodes and 500 best edges. In further steps for constructing the visualization of the interaction networks: i) we selected genes connected by best edges and best edges associated with best nodes; ii) genes which appeared in at least 5 out of 8 analysed datasets using coagulation related seeds.

Selection of the top genes based on co-expression network analysis using NERI
In order to identify the top genes which could serve as signalling 'bottleneck' in COVID-19 we performed interaction analysis using four different expression datasets: PBMC from COVID-19 patients, hiPSC-CMs infected with SARS-CoV2, heart and lung tissues from deceased COVID-19 patients. Each dataset was analysed using NERI algorithm [36], which integrated protein-protein interactome data with expression data using so-called seed-genes, enabling us to focus on the part of the network related to: (i) ACE2 interactome, (ii) interaction of ACE2-related genes with coagulation-related genes, (iii) coagulation related genes with high proximity to ACE2 interactome. The algorithm ranked genes based on their role as hub genes connecting multiple signalling pathways and disease-related changes of expression in neighbouring genes with correlated expression changes. In a further step, we overlapped all 12 outputs (4 expression datasets, 3 seeds each) to identify clusters of the top genes and interactions regulated by coronaviruses. For further analyses we used only results for the following sets of seeds: (i) interaction of ACE2-related genes connected coagulation-related genes; (ii) coagulation related genes with high proximity to ACE2 interactome. Simultaneously, we performed enrichment analysis to verify whether coagulation-related processes were also affected by this virus and ACE2 signalling. We also performed visualization of the interaction network between top genes. In the final step, we identified the top microRNAs (miRNA) which targeted the highest number of the top genes as well as the highest number of DE genes (affected by the disease) across analysed datasets.

miRNA target prediction:
On all steps of bioinformatic analyses we used the R package 'wizbionet' https://github.com/wizbionet/wizbionet/blob/mas ter/doc/vignette_wizbionet.md [52]. To identify miRNAs regulating DE genes we used a topmiRNA_toptarget function based on the multiMiR R package [53]. We looked for the −3p and −5p mature version of each miRNA identified by multiMiR and their targets among DE genes and coagulation/ thrombosis related genes. Then we searched the top 10% hits among all conserved and non-conserved target sites in 14 target prediction databases.

Enrichment analysis
Enrichment analysis is a computational method for increasing the likelihood to identify the most significant biological processes related to the study. Enrichment analysis of the networks was done using StringApp and EnrichR database [54], using the Hypergeometric test with Benjamini and Hochberg correction, while the reference was the human genome. For all enrichment analyses, the significance cut-off was set to BH adjusted p-value ≤0.05.

Ethics statement
The study was conducted with the Declaration of Helsinki [55]. The study protocol was approved by an ethical committee. Written informed consent form was obtained from all participants, or their legal representatives.

Study group
This was an observational study which included 79 COVID-19 patients admitted to the Clinic of Internal Medicine, Pneumonology, Allergology and Clinical Immunology at the Military Institute of Medicine in Warsaw. Patients aged 18 years or older with a positive nasopharyngeal swab PCR test for SARS-CoV-2 were recruited during the third pandemic wave in Poland (January-May 2021), who were admitted to the hospital based on clinical presentation including severe dyspnoea requiring oxygen therapy and monitoring on intensive care unit (ICU) for at least 24 hours. Blood samples of COVID-19 patients were collected at three different time points, including the day of admission, 7-days and 21-days after admission. The patients were divided according to the hospitalization length of stay (>21 days) and/or death in follow-up as a composite endpoint. Thirty two age and sex matched SARS-CoV-2 infection-free participants were sampled in out-patients clinic.
Comprehensive demographic, clinical, pharmacological and laboratory data were abstracted manually from the electronic medical records. All blood samples were collected as whole blood by using Tempus™ Blood RNA Tube (Applied Biosystems). All specimens were immediately aliquoted, frozen and stored at −80°C. Frozen sample was thawed once only to prevent repeated freeze-thaw cycles.

RNA preparation, detection, and quantification of miRNAs by quantitative PCR
Total RNA was extracted using Tempus™ Spin RNA Isolation Kit (Invitrogen) from 9 ml whole blood. Subsequently, the obtained RNA template was subjected to a reverse transcription reaction using the TaqMan miRNA Reverse Transcription kit (ABI, California, USA) according to guidelines provided by the manufacturer. Afterwards, miRNA expressions were detected by quantitative polymerase chain reaction (qPCR) using TaqMan miRNA Advanced Assay kits (ABI, California, USA, catalogue number A25576, assay ID; 477860_mir; 478576_mir; 478384_mir; 477927_mir) for the corresponding miRNAs on a CFX384 Touch Real-Time PCR Detection System (BioRad Inc. Hercules, California, USA). During the RNA extraction phase cel-miR-39 was spiked-in as an exogenous control and all qRT-PCRs were normalized to their corresponding cel-miR-39. Mean values of all reactions -performed in triplicate -were used in statistical analysis as previously described [56,57]. MiRNA expressions were expressed as 2-ΔCT [miRNA expression -cel-miR-39 expression] [56,58-60], then log-10 transformed for statistical analysis.

Statistical analysis
All results for categorical variables were presented as a number and percentage. Continuous variables were expressed as mean ± standard deviation (SD) or median and interquartile range (IQR), depending on the normality of distribution assessed by means of a Shapiro-Wilk test. The Student t-test or the Mann-Whitney test for unpaired samples, and Wilcoxon test for paired samples, were applied depending on the normality of the distribution. One-way Anova or Kruskal-Wallis tests were used for more than two groups comparison based on the distribution of the data. Receiver operating characteristic (ROC) analysis was performed to assess the predictive value of miR-16-5p for increased hospital length of stay or death in follow-up as a composite endpoint. To determine independent variables affecting increased hospital length of stay or death we applied multivariate logistic regression analysis. For this purpose, we included the following variables: low miR-16-5p (>6.1 expression based on Log-transformed data), age (years), male sex, BMI >30, hypertension, diabetes mellitus, smoking and coronary vascular disease. All tests were twosided with the significance level of p < 0.05. Calculations were performed using SPSS version 22.0 (IBM Corporation, Chicago, USA). Based on a 76% increased hospitalization in the low baseline miR-16-5p group as compared to 24% increased hospitalization in the high baseline miR-16-5p group, we calculated that with 79 patients, our analysis had 99% power with a two-sided alpha value of < 0.05.

Bioinformatic analysis results
In this study we aimed to, based on co-expression analysis using NERI algorithm and set seed genes, to identify clusters of hub genes [37,38]. Those genes, also called 'bottleneck regulators' through corroborating signals across transcript expression and protein-protein interaction data, would lead to pathological changes in coagulation processes even when such regulators show little or no changes in expression between control and disease conditions. Then based on those results we identified top non-coding regulators of the interaction between ACE2 interaction networks which could be affected by SARS-CoV-2 infection and coagulation process. Identified in this study miRNAs were shared regulators of ACE2 network and coagulationrelated genes between PBMCs, heart, lungs and cardiomyocytes infected with coronaviruses. The workflow of the analysis is shown on the (Fig. 1A,B).

Identification of the ACE2 related coagulation seed genes
In order to select the best seed genes for further coexpression analysis using NERI we used 69 genes identified in our previous study as involved in ACE2 interaction network [1] and further extended it for genes related to thrombosis and coagulation. For each gene we calculated the so-called coagulation score based on its expression in the CV system, presence in expression datasets, and presence in disease-related gene lists (thrombosis term in Malacards/DisgeNet); and finally its connection with the ACE2 network. For further analyses we selected three genesets: i) ACE2-68 genes interacting with ACE2; ii) ACE2 and coagulation-53 genes with high coagulation score and high connectivity with ACE2 network; iii) Coagulation-40 genes with high coagulation score (≥7) with high proximity to ACE2 interactome. All genes used in seed selection analysis are shown in the Supplementary  Figure 1 and are available as a Cytoscape network file (Supplementary file 1).

Top coagulation-related genes affected by coronavirus infections
Coexpression analysis using NERI across all analysed datasets and seeds, allowed us to found several top genes which could play a pivotal role in COVID-19 related thrombosis, such as EGFR, HSP90AA1, APP, TP53, PTEN, UBC, FN1, ELAVL1 and CALM1 (Fig. 2)  The detailed workflow of top genes selection is described in the Methods section. Among the top genes identified by enrichment analysis as playing a key role in signalling pathways associated with coagulation-related interaction networks in COVID-19, we identified CALM1 and enhancement of its connection with PTEN and UBC (Fig. 3A,B).

Pathway enrichment analysis identified CALM1 as the most potent regulatory gene
In order to identify thrombosis-related pathways affected by coronavirus infection we performed enrichment analysis of the top genes. This analysis revealed that the highest number of the top genes was associated with pathways related to immune system and platelet activation signalling and aggregation. We also observed a strong enrichment of disease, signalling by interleukins and cytokines (Fig. 3A). CALM1 was identified in this analysis as a gene connected to the highest number of significantly overrepresented pathways. Closer analysis of its interaction network showed that PTEN and its interaction with CALM1 was stronger in control samples than in coronavirus samples, while interactions with ELAVL1, EGFR and UBC were stronger in the disease (Fig. 3B).

Identification of the miRNAs regulating ACE2 and coagulation related interaction networks and DE genes affected in COVID-19
In order to identify miRNAs which play a role in ACE2related thrombosis in SARS-CoV2 infection we screened four lists of DE genes (SARS, cardiomyocytes, heart and lungs), as well as DE coagulation genes from those datasets. Next, we looked for miRNAs regulating the highest number of top NERI nodes and top NERI targets associated with coagulation. Top miRNAs were defined as regulating the top 30% of genes within each category (all DE genes, coagulation DE genes, top NERI nodes, top NERI nodes associated with ACE2 related coagulation). We performed target predictions for the 1416 miRNAs showing any expression confidence in the cardiovascular system according to the Tissues 2.0 database. In this analysis we identified 34 pre-miRNAs which targeted the highest number of 'bottleneck regulators' associated with coagulation shared across four COVID-19 related datasets, and the highest number of DE genes in those datasets (Table 1). Additionally in this table we listed which coagulation-related genes with high scores in NERI analysis, including 13 coagulation genes from Fig. 1, are regulated by the top miRNAs. Among the identified miRNAs that regulated ACE2-coagulation related interaction networks, we have found 8 pre-miRNAs, namely miR-16, miR-27a, miR-34a, let-7b, miR-155, miR-23b, miR-374a and miR-128, that shared the highest number of the top genes identified using NERI and DE genes related to coronavirus infection. In our previous in silico prediction analysis we found miR-16-5p and miR-27a-3p as miRNAs regulating ACE2 networks [1]. Moreover another in silico analysis by Jafarinejad-Farsangi et al. [61] found miR-16-5p and let-7b as targeting SARS-CoV-2 induced differentially expressed genes and miR-155 was predicted related to SARS-CoV-2-induced cytokine storm [62]. Therefore, out of eight miRNAs that were found in the current analysis, we selected particularly four the most promising miRNAs (i.e. miR-27a, miR-16, let-7b and miR-155), and further validated their expression by qRT-PCR in patients hospitalized due to COVID-19.

Participants
Patient characteristics are presented in Tables 2,3. Cardiovascular disease includes coronary artery disease, myocardial infarction and atrial fibrillation. We did not observe any differences between healthy individuals and COVID-19 Figure 1. A) Bioinformatic workflow of the ACE2 and thrombosis-related seed genes selection for the integration of PPI interactome data with coexpression networks by NERI algorithm. This analysis enabled us to focus on a specific part of the network in this case: gradual signalling cascade between ACE2 and genes associated with coagulation-related processes. Seed genes are marked in green. Nodes outside of the ACE2 network were sorted using group circular layout based on the number of occurrences on four thrombosis and coagulation-related gene lists. Node colours were related to so-called 'coagulation score' calculated based on the occurrences on four gene lists and expression in four cardiovascular-system related tissues. B) Bioinformatic workflow leading to the identification of the miRNAs regulating the coagulation process in coronavirus infection.
patients with regard to basic demographic data including sex, age and body mass index (BMI) (p = 0.254, p = 0.707, p = 0.421, respectively). Among the 79 patients included, 9 (11.4%) patients, who were admitted to intensive care unit (ICU), died within the median of 11 days.
Patients' laboratory data at 3 different time-points are presented in Table 3. There was a significant difference in WBC count (p = 0.028), increase in lymphocyte count and iron levels (p = 0.015, p = 0.001 respectively), and a decrease in CRP levels (p < 0.001) over time. Fig. 4 shows the comparison of circulating miRNAs relative expression between healthy individuals and COVID-19 patients at 3 different timepoints (at admission, 7-days after admission, and 21-days after admission). The expression levels of miR-16-5p and let-7b in COVID-19 patients were lower at baseline, 7-days and 21-day after admission compared to the healthy controls (p < 0.0001 for all time points for both miRNAs). The expression levels of miR-27a-3p and miR-155-5p in COVID-19 patients were higher at day 21 compared to the healthy controls (p = 0.007 and p < 0.001, respectively). In COVID-19 patients, miR-27a-3p and miR-155-5p expressions increased over time. MiR-155-5p expression levels differed between healthy individuals and COVID-19 patients (p = 0.009).

Low baseline expression of miR-16-5p in patients with COVID-19 is associated with an increased hospital length of stay
Patients hospitalized over a duration of 21 days had significantly lower expression levels of miR-16-5p when compared to those hospitalized under 21 days (p < 0.0001) (Fig. 5A,C). According to the ROC curve analysis, a low baseline miR-16-5p expression presents predictive utility in assessment of the hospital length of stay (AUC:0.815, p < 0.0001) (Fig. 5B). Similar results were found for the comparison of hospital length of stay or death in follow-up as a composite endpoint (AUC:0.810, 95% CI, 0.71-0.91, p < 0.0001) (Fig. 5D). No significant findings were observed according to other miRNAs The edge size is related to the number of occurrences in the top 30% of strongest regulated edges across analysed datasets (SARS, cardiomyocytes, lungs, heart) and two types of seeds 'ACE2 and coagulation' and 'coagulation' (8 gene lists in total). The node and edge sizes are associated with the importance in signal flow within the network for the analysed expression datasets (12 in total), which does not necessarily relate to upregulation in expression. If node/edge was enhanced according to NERI in higher number of disease-related datasets, it has red colour and has high importance in disease, if in control-related dataset, it is green and has decreased importance in disease. Grey colour means disease or control rate was the same in all datasets. The node size is associated with the sum of Δ' (S) scores obtained from NERI. The higher the S score, the more important the role of a given gene in the analysed network. To show the tissue-specific expression level of each gene we added bars on each node reflecting their expression confidence (0-5) using the data obtained from the Tissues 2.0 database. EGFR, ELAVL1 and APP were identified across all 12 analysed gene lists (including three sets of seeds) as the top regulators of the thrombosis-related networks. DE genes in expression datasets have blue borders. whose expression levels were measured in our cohort (miR-27a-3p, let-7b-5p and miR-155-5p) (Supplementary Figure 2).
The COVID-19 patients group was divided into two subgroups by using ROC curve analysis (based on hospital length of stay >21 days and/or death) for miR-16-5p, i.e. low-or high value ( Fig. 5 and Table 4). The cut-off value of ≤ 6.1 was labelled as low miR-16-5p level (39% of the population). The sensitivity and specificity was 78% and 73% respectively for the cut off value. Similar results were found for ROC analysis based on a single endpoint of length of hospital stay > 21 days (Supplementary Table 1). All the other analysed miRNAs (let-7b-5p, miR-27a-3p, miR-155-5p) did not present statistically significant predictive utility for increased hospitalization based on ROC curve analysis (data not shown).

Discussion
The aim of our study was to elucidate the expression pattern of circulating miRNAs associated with COVID-19 related thrombosis based on a bioinformatic analysis. We also compared the expressions of selected miRNAs between COVID-19 patients and healthy volunteers and monitored circulating miRNA expression patterns during the acute phase of COVID-19 disease, as well as the prognostic potential of these miRNAs as biomarkers. Among the possible strategies to select miRNAs for validation studies, we performed in silico bioinformatic analysis incorporating network medicine approaches. Such an approach allows us to summarize all available evidence regarding analysed biological processes and genes to generate predictions of specific targets and their molecular interactions.
The main findings of our study are: (i) based on in silico bioinformatic analysis the top miRNAs targeting thrombosisrelated DE genes in response to SARS-CoV-2 infection were miR-16-5p, miR-27a-3p, let-7b-5p and miR-155-5p; (ii) the expression of miR-16-5p, miR-27a-3p and miR-155-5p increased during observation, compared to the baseline measurement; (iii) early expression changes were observed only for miR-155-5p, let-7b-5p and miR-16-5p when comparing to healthy controls; (iv) lower baseline expression of miR-16-5p and DM were independent predictors of increased length of stay or death according to a multivariate analysis.

Identification of the top genes related to thrombotic events associated with COVID-19
Analysis of ACE2 and coagulation interaction networks using NERI algorithm in four coronavirus infection-related expression datasets obtained from GEO database discovered multiple hub genes with corrupted signalling which can be responsible for thrombosis in COVID-19 patients. The most affected genes were EGFR, APP, HSP90AA1, TP53, PTEN, UBC, FN1, ELAVL1 and CALM1 (Figs. 2,3). Among the top

. A) Top 25 enriched pathways associated with top nodes identified by NERI algorithm as important in coagulation networks in SARS and SARS-CoV-2 infections.
Grey dots indicate whether a gene is present within significantly enriched signalling pathways. It is worth noticing that CALM1 was present in the highest number of signalling pathways as an important ACE2 interactor. The top interactors with highest number of associated pathways were ordered in decreasing manner from left to right. B) Alterations between CALM1 interactors in COVID-19. Green colour is associated with loss of importance in disease which can lead to switching the signalling onto neighbouring nodes. Red colour is associated with increased importance of the node/edge in disease, leading to increase of signalling in a given part of the network.   hsa-miR-9-3p|hsa-miR genes playing a role in coagulation-related interaction networks in COVID-19, we identified CALM1 and enhancement of its connection with PTEN and UBC (Fig. 3). As we showed before, CALM1, APP, EGFR and ELAVL1 are crucial components of the ACE2 interaction network affected by SARS-Cov2 infection in pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) [1]. EGFR production has been linked to thrombosis risk and inflammatory markers, and it was previously shown that viral infections may induce EGFR signalling and promote a proinflammatory and pro-angiogenic response [63,64]. Overactive EGFR signalling also leads to increased fibrosis after SARS coronavirus infection [65]. APP (amyloid-β precursor protein gene), is a gene encoding APP protein, a precursor for amyloid beta (Aβ) peptide. Platelets, a main source of Aβ in circulation, can be activated via various factors, i.e. inflammation and viral antigens. Our recent bioinformatic analysis pointed to APP and PTEN as the top genes involved in platelet activity and most susceptible for noncoding regulation by platelet-related miRNAs [66]. Upon their activation, platelets release APP and Aβ into the circulation. Moreover, APP is cleaved by endothelial cells, creating Aβ. Aβ promotes the activation of the coagulation factor XII, which results in stimulation of thrombin generation, possibly creating a prothrombotic state [67,68]. It was shown that Aβ may play a role not only in haemostasis but also in inflammation [67,69,70]. This makes it an important finding for elucidating the COVID-19-related thrombosis but also potentially COVID-19 related neurodegeneration due to its association with Alzheimer's disease. Another DE gene from our network is HSP90AA1, which codes heat shock protein 90α (Hsp90α), inducing inflammation through activation of the NF-kB and STAT3 pathways, while NF-kB also induces the expression of Hsp90α [71]. In human cell lines infected with SARS-CoV and SARS-CoV-2, HSP90 inhibition resulted in reduction of viral replication, which suggests its involvement in infection []. Taking into account recent studies pointing out HSP90 as a potential target in COVID-19 treatment [72,73,74,75], this finding especially supports the value of presented here results.
Another top gene identified in this study was ELAVL1, which encodes HuR protein, playing a critical role in posttranscriptional regulation, splicing and suppression of thrombomodulin synthesis in interleukin-1β (IL-1β) treatment and sepsis [76]. Thrombomodulin is a critical factor to activate protein C (aPC) in mediating the anticoagulation and antiinflammation effects. Additionally, a recent study showed that there is a significant positive correlation between ELAVL1 and ACE2 in chronic obstructive pulmonary disease (COPD) cells [77]. This result especially points out the direct link between thrombosis and SARS-Cov-2 receptor ACE. CALM1, which encodes a highly conservative protein, a key calcium sensor -calmodulin (CaM), represents another interesting candidate in our study. CaM interacts with viral proteins and positively influences the propagation of rotavirus infection [78]. CaM also regulates the activation of GPVI and GPIb-IX-V platelet receptors and NOS activation, which means that it has influence on platelet and endothelial homoeostasis [79,80]. Moreover, CALM1 is also an important ACE2 interactor, and is playing a role in viral pathogenesis [77,81,82]. In our analysis we observed alteration by coronaviruses and its connection with PTEN, EGFR, FLNA, ELAVL1 and NXF involved in regulation of the platelet activity and inflammatory processes. This suggests  Data are presented as mean ± standard deviation (SD) and median and interquartile range based on the data distribution. P-value is calculated based on One-way ANOVA test for three groups comparison. p values marked with bold indicate statistically significant differences between the multiple groups <0.05. Abbreviations: BMI, body mass index; COPD, chronic obstructive pulmonary disease; hsCRP, high-sensitivity C-reactive protein; CVD, cardiovascular disease; DM, diabetes mellitus; PCT, procalcitonin; SD, standard deviation; WBC, white blood cells.
the important role of CALM1 in COVID-19 related thrombotic complications. Moreover, in enrichment analyses we found strong overrepresentation of pathways related to immune system and platelet activity pointing out that genes identified based on publicly available expression data indeed play a role in thrombosis processes induced by coronavirus infection.

Identification of the top miRNAs related to thrombotic events associated with COVID-19
The identification of the top genes related to thrombosis and ACE2 interaction network, affected by coronavirus infection, allowed us to predict the miRNAs with the biggest potential of playing a role as biomarkers in thrombotic complications associated with COVID-19. MiR-16-5p takes part in the regulation of inflammation and programmed cell death. Many studies showed the anti-inflammatory effect of miR-16 in atherosclerosis, acute lung injury and sepsis [83,84,85]. MiR-16 influences the phenotypic changes on T cells survival, differentiation, and proliferation, which are critical for antiviral responses. Limited studies aimed to analyse the predictive and prognostic value of miR-16-5p, and miR-27a in cardiovascular diseases. A previous publication showed up-regulation of miR-16 in patients with Takotsubo cardiomyopathy compared to healthy subjects [86]. Latest meta analysis assessed the prognostic utility of 19 circulating miRNAs included in four relevant articles in heart failure patients. Meta analysis showed that patients with low expression of miR-16-5p and miR-27a levels have significantly worse overall survival [87]. MiR-27a was found significantly lower in patients with atherosclerosis compared to healthy individuals, however the study included only 25 patients and 26 controls [88]. However, its impact on SARS-CoV2 infection remains unclear. Previous bioinformatic analysis revealed the ability of miR-16-5p to target genes responsible for host-SARS-CoV2 interaction [1,61]. Another possible implication of this miRNA in COVID-19 infections is its ability to influence viral entry receptor (ACE2) related networks, which was also determined by bioinformatic analyses [1,89]. However, the implications of these findings have not been deeply investigated yet. One possible explanation is that miR-16 suppresses cell cycle and prevents viral replication through decreasing CCND1 level, which is a crucial protein in G1 to S phase transition during cell cycle processes [61,90].
In line with our data, miR-16-5p represents the most abundant miRNA in the plasma, followed by miR-223-3p, let-7b-5p and miR-146a-5p in previous study [91]. Further several miRNAs, including miR-16-2-3p, DE in peripheral blood and infected tissues from COVID-19 patients when compared to healthy controls, indicate a potential role in the infection [92,93]. However miR-16-5p is downregulated in lung epithelium, but upregulated in peripheral blood [92,93]. Importantly, in another study, plasma levels of miR-16-5p were lower in critically ill patients and were negatively correlated with the total number of days of ICU stay (rho = −0.38). Besides, miR-16-5p in cluster with miR-92a-3p, miR-98-5p, miR-132-3p, miR-192-5p and miR-323a-3p showed significant decrease in patients who did not survive at the ICU, however miR-16-5p was not included in the multivariate analysis that predicted mortality during the ICU stay [43]. Also in our analysis we did not find a correlation between miR-16-5p expression and mortality, although it was significantly lower in individuals who stayed longer in hospital.
Thrombosis-related miR-27a-3p also influences ACE2 related pathways [1] and its expression was upregulated in hospitalized COVID-19 patients compared to healthy controls. We also observed a significant increase in its expression during the stay in hospital, however there was no association with in-hospital length of stay or mortality. MiR-27a-3p is DE between ward and ICU patients, was correlated with leukocytes and CRP [43], and represents a potential in regulation of inflammatory response in sepsis. MiR-27a stimulates the  synthesis of inflammatory cytokines through the NF-kB pathway, and thus IL-6 and TNF-a expression. Thus, it may be hypothesized that miR-27a promotes pulmonary inflammation and sepsis or may reflect SARS-CoV-2 mediated gastrointestinal tract infection or inflammation [94]. In a previous study which performed small RNA deep sequencing in patients with COVID-19, miR-16-5p was the most abundant regulated miRNA, followed by miR-223-3p, let-7b-5p and miR-146a-5p [91]. Moreover, previously published bioinformatic analysis also presented the interaction between let-7b-5p and TMPRSS [95]. Based on our bioinformatic prediction analysis we validated the potential of let-7b in COVID-19 patients and found that low expressions of let-7b can have diagnostic utility in patients with COVID-19. Let-7b can potentially enhance the synthesis of inflammatory cytokines, activating TLR4/NF-κB pathway and inhibiting the TLR4 expression, which in COVID-19 patients may be interesting especially in neutrophils, due to their role in sepsis development. Thus, let-7b might indirectly stimulate neutrophils recruitment [96]. Let-7b further inhibits M protein expression of the SARS-CoV-2 genome. Interestingly, let-7 not only IL-6 mRNA expression, but also significantly decreased the expression of many other SARS-CoV-2 related cytokines and chemokines including IL-1β, IL-8, CCL2, GM-CSF, TNF-α, and VEGFα. Therefore, let-7, a miRNA ubiquitously expressed in human cells, blocks SARS-CoV-2 replication by targeting S and M protein, as well as inhibits the expression of multiple inflammatory mediators [97].
As previously described miRNAs, miR-155 is also involved in inflammation. There it acts via different mechanisms by directly targeting IL13Rα1, IL-13 receptor and also contributes to IL-8 secretion. Moreover, it influences the T-cell differentiation and affects the innate immunity [72]. MiR-155 additionally regulates pathways related with the IFN superfamily, an important regulator of inflammatory response, especially in viral infections [98,99,100]. SARS-CoV-2 infection correlates with a strong increase of miR-155 expression in the infected cells and thereby allows the distinction between patients with influenza-related and COVID-19-related acute respiratory distress syndromes [101]. Recently two human studies found that miR-155 expression significantly differed between critically ill COVID-19 patients and healthy controls [102,103]. Importantly, Tacke et al. [103] did not find miR-155 as an indicator for patient's survival. However, when patients were subdivided according to their age upon admission to the ICU into those younger and older than 65 years, low miR-155 expression levels became a strong independent indicator for patient mortality [103]. In our study only miR-155-5p expression was significantly higher in COVID-19 patients compared to healthy individuals. Moreover, miR-155 expression showed an increasing trend at day-7 and day-21 after admission. Our results are in line with previous reports which stated that high miR-155 expression can be a diagnostic biomarker. Further analysis should confirm its ability to predict the severity of the disease in COVID-19 patients.

Study limitations
The main limitation of our study is the small size of the patient group and a low mortality in the included cohort. Moreover, there were no clinically relevant thrombotic events in the patient population, and thus we could not test the hypothesis whether miRNAs selected based on computational analysis may serve as predictive biomarkers of thromboinflammatory events. Third, due to the hypothesisgenerating study design, we limited our analysis to miRNAs associated with thrombosis, based on our bioinformatic analysis and publicly available expression datasets related to coronavirus infection. However, we did not perform the miRNA sequencing in the collected plasma samples, which might enable us to determine novel miRNAs with higher predictive value for COVID-19. Finally, we did not have the possibility to analyse the top genes that were predicted in relation to thrombosis from our bioinformatic analysis. Altogether, our results should be confirmed in a larger, preferably multicentre study before miRNAs and their target genes can be used as a prognostic biomarker of COVID-19 in clinical practice.

Conclusions
This study enabled us to better characterize changes in gene expression and signalling pathways related to hypercoagulable and thrombotic conditions in COVID-19. In this study we identified and validated miRNAs which could serve as novel, thrombosis-related predictive biomarkers of the COVID-19 complications, and can be used for early stratification of patients and prediction of severity of infection development in an individual. Non-coding RNAs associated with inflammation and coagulation pathways identified by bioinformatic analysis can serve as potential early biomarkers helping in identification of the pathological changes in COVID-19. Future investigations should take into account the effect of pharmacological therapies on the circulating miRNAs pattern in this emerging disease [43]. Identifying novel biomarkers and creating predictive tools may improve outcome in patients with COVID-19, and therefore has a potential for reducing disease-related, personal and economic consequences of the global pandemic. Additional studies in larger cohorts with thrombotic complications and functional Table 5. Multivariate logistic regression model for prediction of increased hospital length of stay or death in follow-up as a composite endpoint by low expression baseline of miR-16-5p along with clinical variables. approaches are warranted to validate these findings and provide further insight into the role of circulating miRNAs as biomarkers and functional mediators of COVID-19.

Author Contributions:
Writing